Automatic Stencil Code Generation: Ph.D. Thesis Proposal
Author
Abstract
Stencil-based kernels form the core of many scientific applications on block-structured grids, from simple Jacobi iterations to complex multigrid and block-structured adaptive PDE solvers. Unfortunately, these codes achieve only a low fraction of peak performance, primarily because of the disparity between processor and main-memory speeds. For my Ph.D. dissertation research, I propose to develop an automatic system that generates highly efficient, platform-adapted implementations of stencil kernels. In practice, performance is a complex function of many factors, including compiler technology, machine architecture, instruction scheduling, and memory access behavior. However, through the use of performance models and search, we can generate very good, if not optimal, stencil code. This tuned code can be more than twice as fast as untuned code.
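To make the idea of search over stencil variants concrete, here is a minimal sketch, not the system proposed in the abstract. It assumes a 2D 5-point Jacobi sweep, uses tile size as the only tuning parameter, and picks the fastest candidate by timing alone; the performance-model part of the approach is omitted. All function names (`jacobi_sweep`, `make_grid`, `time_variant`, `search_tile`) and the test problem are hypothetical.

```python
import time


def jacobi_sweep(src, dst, n, tile):
    """One 5-point Jacobi sweep over the interior of an n x n grid,
    visiting the interior points in tile x tile blocks."""
    for ii in range(1, n - 1, tile):
        for jj in range(1, n - 1, tile):
            for i in range(ii, min(ii + tile, n - 1)):
                for j in range(jj, min(jj + tile, n - 1)):
                    # Reads only src and writes only dst, so the
                    # blocking order cannot change the result.
                    dst[i][j] = 0.25 * (src[i - 1][j] + src[i + 1][j]
                                        + src[i][j - 1] + src[i][j + 1])


def make_grid(n):
    # Assumed test problem: top boundary row held at 1.0, all else 0.0.
    grid = [[0.0] * n for _ in range(n)]
    grid[0] = [1.0] * n
    return grid


def time_variant(n, tile, sweeps=2):
    """Run `sweeps` Jacobi sweeps with the given tile size.
    Returns (elapsed seconds, final grid)."""
    src, dst = make_grid(n), make_grid(n)
    start = time.perf_counter()
    for _ in range(sweeps):
        jacobi_sweep(src, dst, n, tile)
        src, dst = dst, src  # the newest values are now in src
    return time.perf_counter() - start, src


def search_tile(n, candidates):
    """Exhaustive search: time every candidate tile size, keep the fastest."""
    timings = {tile: time_variant(n, tile)[0] for tile in candidates}
    return min(timings, key=timings.get), timings


if __name__ == "__main__":
    best, timings = search_tile(64, [4, 8, 16, 62])
    print("fastest tile size on this machine:", best)
```

In pure Python the interpreter overhead hides most cache effects, so the timings here only illustrate the structure of the search. A real generator would emit compiled code for each variant and would typically use a performance model to prune the candidate list before timing.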
Similar resources
Enabling efficient stencil code generation in OpenACC
The OpenACC programming model simplifies the programming for accelerator devices such as GPUs. Its abstract accelerator model defines a least common denominator for accelerator devices, thus it cannot represent architectural specifics of these devices without losing portability. Therefore, this general-purpose approach delivers good performance on average, but it misses optimization opportuniti...
Auto-tuning Stencil Codes for Cache-Based Multicore Platforms
By Kaushik Datta. Doctor of Philosophy in Computer Science, University of California, Berkeley. Professor Katherine A. Yelick, Chair. As clock frequencies have tapered off and the number of cores on a chip has taken off, the challenge of effectively utilizing these multicore systems has become increasingly important. However, the diversi...
Wave Equation Based Stencil Optimizations on Multi-core CPU
As the engine for seismic imaging algorithms, stencil kernels modeling wave propagation are both compute- and memory-intensive. This work targets improving the performance of wave-equation-based stencil code parallelized by OpenMP on a multi-core CPU. To achieve this goal, we explored two techniques: improving vectorization by using hardware SIMD technology, and reducing memory traffic to mitigate...
Ph.D. Proposal: Automatic Repair of Loops
This PhD topic is about automatic software repair, the process of fixing software bugs automatically. Research on automatic software repair has started recently, especially since the invention of GenProg, an automatic repair system for C code [3]. We have been successfully contributing to this field [4, 5, 6, 1]. The PhD student will explore how to automatically repair a ...
GPU-UniCache: Automatic Code Generation of Spatial Blocking for Stencils on GPUs
Spatial blocking is a critical memory-access optimization to efficiently exploit the computing resources of parallel processors, such as many-core GPUs. By reusing cache-loaded data over multiple spatial iterations, spatial blocking can significantly lessen the pressure of accessing slow global memory. Stencil computations, for example, can exploit such data reuse via spatial blocking through t...
Journal title:
Volume, issue:
Pages: -
Publication date: 2007